fix(cluster): recover sharded pubsub topology after node reconnects#3223
Open
PavelPashov wants to merge 10 commits into
Open
fix(cluster): recover sharded pubsub topology after node reconnects#3223PavelPashov wants to merge 10 commits into
PavelPashov wants to merge 10 commits into
Conversation
3cb7843 to
8281293
Compare
nkaradzhov
reviewed
May 7, 2026
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Reviewed by Cursor Bugbot for commit 07c5cf1. Configure here.
07c5cf1 to
aae8c63
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Description
Checklist
npm testpass with this change (including linting)?Note
Medium Risk
Adds automatic background rediscovery triggered by node reconnection events, which changes cluster slot discovery behavior and could cause extra topology refreshes or missed exclusions under reconnection storms.
Overview
Improves cluster recovery by tracking post-ready node reconnection attempts and scheduling a de-duplicated background topology refresh once a configurable delay elapses, while excluding currently reconnecting node addresses from the refresh attempt.
Introduces
ClusterReconnectionTracker(with unit tests) and wires it intoRedisClusterSlotslifecycle/events (reconnecting/ready/end, node removal, unsubscribe, destroy) to maintain accurate reconnecting state.Extends
RedisClusterSlots.rediscoverto optionally start from known nodes (before falling back to root nodes) and adds a new cluster optiontopologyRefreshOnReconnectionAttemptStrategyto control the refresh delay/behavior. Updates sharded pubsub E2E tests to use sharedtestUtilsharness and exercise multiple failure scenarios (failover/node/proxy/shard) plus unsubscribe behavior.Reviewed by Cursor Bugbot for commit aae8c63. Bugbot is set up for automated code reviews on this repo. Configure here.